Scientific High Performance Computing (HPC) Applications On The Azure Cloud Platform
Abstract
Cloud computing is emerging as a promising platform for compute- and data-intensive scientific applications. Thanks to its on-demand, elastic provisioning capabilities, cloud computing has attracted interest from researchers in a wide range of disciplines. However, even though many vendors have rolled out commercial cloud infrastructures, their service offerings are usually best-effort only, without any performance guarantees. The use of these resources is questionable if they cannot meet the performance expectations of deployed applications. Additionally, the lack of familiar development tools hampers the productivity of eScience developers writing robust scientific high performance computing (HPC) applications. No standard framework is currently supported by a large set of vendors offering cloud computing services. Consequently, porting scientific applications between cloud platforms is hard. Among all clouds, the emerging Azure cloud from Microsoft in particular remains a challenge for HPC program development, both because it lacks support for traditional parallel programming models such as Message Passing Interface (MPI) and map-reduce and because its application programming interfaces (APIs) are still evolving. We have designed new frameworks and runtime environments to help HPC application developers by providing easy-to-use tools, similar to those known from traditional parallel and distributed computing environments, such as MPI, for scientific application development on the Azure cloud platform. It is challenging to create an efficient framework for any cloud platform, including the Windows Azure platform, because these platforms are mostly offered to users as a black box with a set of APIs for accessing various service components. The primary contributions of this Ph.D. thesis are (i) creating a generic framework for bag-of-tasks HPC applications to serve as the basic building block for application development on the Azure cloud platform, (ii) creating a set of APIs for HPC application development on the Azure cloud platform, similar to the Message Passing Interface (MPI) from traditional parallel and distributed settings, and (iii) implementing Crayons, using the proposed APIs, as the first end-to-end parallel scientific application to parallelize fundamental GIS operations.
INDEX WORDS: Cloud computing, GIS computations using cloud platforms, Windows Azure cloud platform, Scientific applications over cloud platforms
Related resources
Empirical Performance Analysis of HPC Benchmarks Across Variations in Cloud Computing
High Performance Computing (HPC) applications are scientific applications that require significant CPU capabilities. They are also data-intensive applications requiring large data storage. While many researchers have examined the performance of Amazon’s EC2 platform across some HPC benchmarks, an extensive study and comparison of Amazon’s EC2 and Microsoft’s Windows Azure is largely ...
Scalable Parallel Scientific Computing Using Twister4Azure
Recent advances in data intensive computing for science discovery are fueling a dramatic growth in use of data-intensive iterative computations. The utility computing model introduced by cloud computing combined with the rich set of cloud infrastructure and storage services offers a very attractive environment for scientists to perform data analytics. The challenges to large-scale distributed c...
Towards an MPI-like Framework for Azure Cloud Platform
Message passing interface (MPI) has been widely used for implementing parallel and distributed applications. The emergence of cloud computing offers a scalable, fault-tolerant, on-demand alternative to traditional on-premise clusters. In this thesis, we investigate the possibility of adopting the cloud platform as an alternative to conventional MPI-based solutions. We show that cloud platform c...
Rapid Processing of Synthetic Seismograms Using Windows
Currently, numerically simulated synthetic seismograms are widely used by seismologists for seismological inferences. The generation of these synthetic seismograms requires a large amount of computing resources, and the maintenance of these observed seismograms requires massive storage. Traditional high-performance computing platforms are inefficient at handling these applications because rapid comp...
Data Replication-Based Scheduling in Cloud Computing Environment
High-performance computing and vast storage are two key factors required for executing data-intensive applications. Compared with traditional distributed systems such as data grids, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...